HERCULES: Hierarchical Embedding-based Recursive Clustering Using LLMs for Efficient Summarization
Petnehazi, Gabor, Aradi, Bernadett
The explosive growth of complex datasets across various modalities necessitates advanced analytical tools that not only group data effectively but also provide human-understandable insights into the discovered structures. We introduce HERCULES (Hierarchical Embedding-based Recursive Clustering Using LLMs for Efficient Summarization), a novel algorithm and Python package designed for hierarchical k-means clustering of diverse data types, including text, images, and numeric data (processed one modality per run). HERCULES constructs a cluster hierarchy by recursively applying k-means clustering, starting from individual data points at level 0. A key innovation is its deep integration of Large Language Models (LLMs) to generate semantically rich titles and descriptions for clusters at each level of the hierarchy, significantly enhancing interpretability. The algorithm supports two main representation modes: 'direct' mode, which clusters based on original data embeddings or scaled numeric features, and 'description' mode, which clusters based on embeddings derived from LLM-generated summaries. Users can provide a 'topic_seed' to guide LLM-generated summaries towards specific themes. An interactive visualization tool facilitates thorough analysis and understanding of the clustering results. We demonstrate HERCULES's capabilities and discuss its potential for extracting meaningful, hierarchical knowledge from complex datasets.
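HERCULES itself works level by level from individual points and layers LLM-generated titles and descriptions on top, which is beyond a short snippet. As a rough illustration of the recursive-k-means idea only, here is a toy top-down variant on 1-D numeric data (all names hypothetical, not the package's API):

```python
import random

def kmeans_1d(points, k, iters=20, seed=0):
    """Plain 1-D k-means; returns the non-empty clusters as lists."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: abs(p - centers[j]))
            clusters[nearest].append(p)
        # recompute each center as its cluster mean (keep old center if empty)
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return [c for c in clusters if c]

def build_hierarchy(points, k=2, min_size=3, level=0):
    """Recursively split a cluster with k-means until it is small."""
    node = {"level": level, "points": sorted(points), "children": []}
    if len(points) > min_size:
        for cluster in kmeans_1d(points, min(k, len(points))):
            if len(cluster) < len(points):   # avoid degenerate non-splits
                node["children"].append(
                    build_hierarchy(cluster, k, min_size, level + 1))
    return node
```

In the real algorithm, each node of such a tree would additionally carry an LLM-generated title and description, and in 'description' mode the next level would cluster embeddings of those summaries rather than the raw data.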
multivariateGPT: a decoder-only transformer for multivariate categorical and numeric data
Loza, Andrew J., Kim, Jun Yup, Song, Shangzheng, Liu, Yihang, Sung, Joseph J. Y., Taylor, R Andrew, Shung, Dennis L.
Real-world processes often generate data that are a mix of categorical and numeric values that are recorded at irregular and informative intervals. Discrete token-based approaches are limited in numeric representation capacity while methods like neural ordinary differential equations are not well suited for categorical data or informative sampling and require augmentation to handle certain classes of trajectories. Here, we present multivariateGPT, a single architecture for modeling sequences of mixed categorical (including tokenized text) and numeric data. This is accomplished with an autoregressive sequence decomposition, embedding scheme, and loss function that extend the next token prediction task to likelihood estimation of the joint distribution of next token class and value. We demonstrate how this approach can efficiently learn to generalize patterns in simple physical systems and model complex time series including electrocardiograms and multivariate electronic health record data. This work extends the utility of transformer based models to additional classes of data.
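The joint objective described above can be pictured as a categorical term for the next token's class plus a conditional density term for its value when the class is numeric. A minimal sketch of such a decomposed negative log-likelihood, assuming an illustrative Gaussian parameterization for the numeric part (not necessarily the paper's exact formulation):

```python
import math

def joint_nll(class_logits, true_class, value=None, mu=0.0, sigma=1.0):
    """NLL of (class, value): the categorical term always applies; a
    Gaussian term is added only when the class carries a numeric value."""
    # -log softmax(class_logits)[true_class], computed stably
    m = max(class_logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in class_logits))
    nll = log_z - class_logits[true_class]
    if value is not None:  # -log N(value | mu, sigma^2)
        nll += 0.5 * math.log(2 * math.pi * sigma ** 2) \
               + (value - mu) ** 2 / (2 * sigma ** 2)
    return nll
```

Training then minimizes this quantity summed over the sequence, so purely categorical steps and mixed class-plus-value steps share one autoregressive loss.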
ESG Rating Disagreement and Corporate Total Factor Productivity: Inference and Prediction
Zhanli Li
Highlights: ESG rating disagreement can lead to a decline in corporate total factor productivity. When faced with ESG rating disagreement, reactive green innovation by enterprises does not lead to improvements in total factor productivity.
This paper explores the relationship between ESG rating disagreement and total factor productivity (TFP) based on data from Chinese domestic ESG rating agencies and the financial data of A-share listed companies in China from 2015 to 2022. The empirical results show that ESG rating disagreement reduces corporate TFP, a conclusion validated through multiple robustness tests. The mechanism analysis reveals an interaction effect between green innovation and ESG rating disagreement: in firms without ESG rating disagreement, green innovation promotes the improvement of TFP; in firms with disagreement, although ESG rating disagreement may drive green innovation, this does not lead to an increase in TFP. The heterogeneity analysis indicates that this effect is more pronounced in non-state-owned, asset-intensive, and low-pollution enterprises.
- Asia > China > Hubei Province > Wuhan (0.04)
- Asia > China > Chongqing Province > Chongqing (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Banking & Finance > Trading (0.93)
- Government > Regional Government > Asia Government > China Government (0.67)
Self-Supervised Predictive Coding with Multimodal Fusion for Patient Deterioration Prediction in Fine-grained Time Resolution
Lee, Kwanhyung, Won, John, Hyun, Heejung, Hahn, Sangchul, Choi, Edward, Lee, Joohyung
Accurate time prediction of patients' critical events is crucial in urgent scenarios where timely decision-making is important. Though many studies have proposed automatic prediction methods using Electronic Health Records (EHR), their coarse-grained time resolutions limit their practical usage in urgent environments such as the emergency department (ED) and intensive care unit (ICU). Therefore, in this study, we propose an hourly prediction method based on self-supervised predictive coding and multi-modal fusion for two critical tasks: mortality and vasopressor need prediction. Through extensive experiments, we demonstrate significant performance gains from both multi-modal fusion and self-supervised predictive regularization, most notably in far-future prediction, which becomes especially important in practice. Our uni-modal, bi-modal, and bi-modal-with-self-supervision configurations scored 0.846/0.877/0.897, respectively.
Streaming Encoding Algorithms for Scalable Hyperdimensional Computing
Thomas, Anthony, Khaleghi, Behnam, Jha, Gopi Krishna, Dasgupta, Sanjoy, Himayat, Nageen, Iyer, Ravi, Jain, Nilesh, Rosing, Tajana
Hyperdimensional computing (HDC) is a paradigm for data representation and learning originating in computational neuroscience. HDC represents data as high-dimensional, low-precision vectors which can be used for a variety of information processing tasks like learning or recall. The mapping to high-dimensional space is a fundamental problem in HDC, and existing methods encounter scalability issues when the input data itself is high-dimensional. In this work, we explore a family of streaming encoding techniques based on hashing. We show formally that these methods enjoy comparable guarantees on performance for learning applications while being substantially more efficient than existing alternatives. We validate these results experimentally on a popular high-dimensional classification problem and show that our approach easily scales to very large data sets.
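As a rough sketch of what a hashing-based streaming encoder can look like, the following count-sketch-style map sends each input coordinate to one output slot and sign via a hash, so the encoder needs no stored codebook and can consume features one at a time. This is illustrative only; the paper's encoders and their formal guarantees differ in detail, and all names here are hypothetical:

```python
import hashlib

def hdc_encode(features, dim=64):
    """Stream (index, value) pairs into a dim-dimensional vector.

    Each input coordinate is hashed to a single output slot and a
    sign, so memory is O(dim) regardless of the input dimension.
    """
    v = [0.0] * dim
    for idx, val in features:
        # deterministic 64-bit hash of the feature index
        h = int.from_bytes(
            hashlib.blake2b(str(idx).encode(), digest_size=8).digest(),
            "big")
        slot = h % dim                        # which coordinate to update
        sign = 1.0 if (h >> 32) & 1 else -1.0  # pseudo-random +/- sign
        v[slot] += sign * val
    return v
```

Because each feature touches exactly one slot, encoding a stream of n features costs O(n) hash evaluations and O(dim) memory, which is what makes this family attractive when the input itself is high-dimensional.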
- Africa > Chad > Salamat (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > North Carolina (0.04)
- Health & Medicine > Therapeutic Area > Neurology (0.54)
- Education (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Modern Data Scientist
Modern Data Science with R is a comprehensive data science textbook for undergraduates that incorporates statistical and computational thinking to solve real-world problems with data. These days, you can find thousands of job openings for the position of Data Scientist. AI and machine learning have surely taken the world by storm. Though AI was invented several decades ago, we have started seeing its practical usefulness only in the last few years. Its applications range from simple ones, such as predicting house prices, to sophisticated ones such as real-time person detection in surveillance video and real-time traffic monitoring, as well as text-based tasks such as rating hotels from past customer reviews, topic modeling, and real-time language translation.
How to Prepare Your Data - KDnuggets
It is rare that you get data in exactly the form you need it. Often you'll need to create some new variables, rename existing ones, reorder the observations, or just drop records in order to make the data a little easier to work with. This is called data wrangling (or preparation), and it is a key part of data science. Most of the time, the data you have can't be used straight away for your analysis: it will usually require some manipulation and adaptation, especially if you need to aggregate other sources of data into the analysis. In essence, raw data is messy (usually unusable at the start), and you'll need to roll up your sleeves to get it into the right shape.
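The operations listed above (renaming variables, deriving new ones, dropping others, reordering observations) can be sketched in plain Python on a small hypothetical table, here a list of dicts with made-up column names:

```python
# a tiny made-up raw table
rows = [
    {"Name": "a.csv", "size_kb": 10, "tmp": 1},
    {"Name": "b.csv", "size_kb": 2500, "tmp": 0},
]

clean = [
    {"name": r["Name"],                  # rename: Name -> name
     "size_mb": r["size_kb"] / 1024}     # derive: size_mb from size_kb
    for r in rows                        # drop: tmp is simply not carried over
]
clean.sort(key=lambda r: r["size_mb"])   # reorder the observations
```

In practice you would usually reach for a library such as pandas or dplyr, but the wrangling steps themselves are the same.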
End-to-End Machine Learning Course 2 Tensors
A tensor is a multi-dimensional array; in deep learning, the term usually refers to arrays of dimension 3 or higher. A scalar is a single number, such as 1; a vector (also known as a list or array) looks like [1, 2, 3]; a two-by-two matrix looks like [[1, 2], [3, 4]]; and a three-dimensional tensor looks like [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]. A vector contains a bunch of scalars, a matrix contains a bunch of vectors, and a tensor contains a bunch of matrices. You can check the data type of a variable in Python using type(variable_name); for a PyTorch tensor, this returns torch.Tensor.
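The nesting described above can be made concrete with plain Python lists as a stand-in for PyTorch tensors (`rank` here is a hypothetical helper that counts nesting depth, analogous to a tensor's number of dimensions):

```python
def rank(x):
    """Nesting depth: 0 for a scalar, 1 for a vector, 2 for a matrix, ..."""
    return 1 + rank(x[0]) if isinstance(x, list) else 0

scalar = 1
vector = [1, 2, 3]                   # a bunch of scalars
matrix = [[1, 2], [3, 4]]            # a bunch of vectors
tensor = [[[1, 2], [3, 4]],
          [[5, 6], [7, 8]]]          # a bunch of matrices

print(type(tensor))  # <class 'list'> here; a torch.Tensor reports its own type
```

In PyTorch, the analogous check is `t.dim()` (or `t.ndim`) on a `torch.Tensor`, which would give 0, 1, 2, and 3 for these four objects.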
Machine learning algorithms explained
Machine learning and deep learning have been widely embraced, and even more widely misunderstood. In this article, I'd like to step back and explain both machine learning and deep learning in basic terms, discuss some of the most common machine learning algorithms, and explain how those algorithms relate to the other pieces of the puzzle of creating predictive models from historical data. Recall that machine learning is a class of methods for automatically creating models from data. Machine learning algorithms are the engines of machine learning, meaning it is the algorithms that turn a data set into a model. Which kind of algorithm works best (supervised, unsupervised, classification, regression, etc.) depends on the kind of problem you're solving, the computing resources available, and the nature of the data.